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The graph theoretic concept of maximal independent set arises in several practical problems in 
computer science as well as in game theory. A maximal independent set is defined by the set 
of occupied nodes that satisfy some packing and covering constraints. It is known that finding 
■ minimum- and maximum-density maximal independent sets are hard optimization problems. In 

' this paper we use cavity method of statistical physics and Monte Carlo simulations to study the 

corresponding constraint satisfaction problem on random graphs. We obtain the entropy of maximal 
QQ ■ independent sets within the replica symmetric and one-step replica symmetry breaking frameworks, 

' shedding light on the metric structure of the landscape of solutions and suggesting a class of possible 

algorithms. This is of particular relevance for the application to the study of strategic interactions in 
social and economic networks, where maximal independent sets correspond to pure Nash equilibria 
of a graphical game of public goods allocation. 
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I. INTRODUCTION 



' It is well known that an important family of computationally difficult problems are concerned with graph theoretical 
^ concepts, such as covering and packing Among them, the problem of finding a minimum vertex cover has probably 
' , become the typical example of NP-hard optimization problems defined on graphs Q and it has recently attracted a 
^ lot of attention in the statistical physics community for its relation with the physics of spin glasses [3]. In particular, 
Q in statistical mechanics the vertex cover problem is usually studied in its dual representation of hard-core lattice gas, 
, where coverings are mapped into particles of unit radius that cannot be located on neighboring vertices. In graph 
theory, such a dual configuration defines an independent set, i.e. a set of vertices in a graph no two of which are 
\ adjacent. More formally, given a graph Q — (V,£) a subset I C V of vertices is an independent set if for every pair 
^ . of vertices i,j G I the edge {i,j) ^ £. The size of an independent set is the number of vertices it contains. So, 
^\ ' given a graph Q with N vertices and an integer M < it is reasonable to ask if it is possible to find an independent 

■ set of size at least M . This decisional problem was proved to be NP-complete and its optimization version, i.e. 
CO finding an independent set of the largest size {maximum independent set), is NP-hard [H, From the definition it is 

, straightforward to notice that the complement of an independent set is a vertex cover and finding a minimum vertex 
t — ' cover is exactly equivalent to find a maximum independent set. 

' We are here interested to a slightly different graph theoretic problem, dealing with the concept of maximal inde- 
pendent set (mIS), that is an independent set that is not a subset of any other independent set. This means that 
adding a node to a maximal independent set would force the set to contain an edge, contradicting the independence 
constraint. Again, by removing a vertex from an independent set we still get an independent set but the same is 
not true for maximal independent sets. In this sense, mISs present some property typical of packing-like problems 
^ . that makes the corresponding optimization problem quite different from the widely studied vertex cover problem. In 

■ particular, it turns out that mISm are actually the intersection of independent sets and vertex covers. See Fig. [l]for 
some examples of mISs in a small graph. 

A maximal independent set can be easily found in any graph using simple greedy algorithms, but they do not 
allow to control the size of the mIS. As for the independent set problem, the complexity increases if we are asked to 
find maximal independent sets of a given size M and, in particular, the problems of finding maximal independent 
sets of maximum and minimum size are NP-hard. Note that as well as a maximum maximal independent set (MIS) 
(maximum independent set) is equivalent to a minimum vertex cover, a minimum maximal independent set (mis) is 
often addressed as minimum independent dominating set 

Apart from the purely theoretical interest for problems that are known to be computationally difficult, finding 
maximal independent sets plays an important role in designing algorithms for studying many other computational 
problems on graphs, such as /c-Coloring, Maximal Clique and Maximal Matching problems. Moreover, distributed 
algorithms for finding mISs, like Luby's algorithm , can be applied to networking, e.g. to define a set of mutually 
compatible operations that can be executed simultaneously in a computer network or to set up message-passing based 
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communication systems in radio networks [5|- One recent and interesting application is in microeconomics, where 
maximal independent sets can be identified with Nash equilibria of a class of network games representing public goods 
provision Hence, studying maximal independent sets allows to understand the properties of Nash equilibria in 

these network games. 

Methods from statistical mechanics of disordered systems turned out to be extremely effective in the study of 
combinatorial optimization and computational problems, in particular in characterizing the "average case" complexity 
of these problems, i.e. studying the typical behavior of randomly drawn instances (ensembles of random graphs) [8| 
that can be very different from the worst case usually analyzed in theoretical computer science. Following the standard 
lattice gas approach, we represent maximal independent sets as solutions of a constrained satisfaction problem (CSP) 
and provide a deep and extensive study of their organization in the space {0, 1}^ of all lattice gas configurations. 

On general graphs the number of maximal independent sets is exponentially large with the number of vertices iV, 
therefore an interesting problem is to compute their number as a function of their size M . On random graphs, this 
number can be evaluated using different methods. An upper bound is obtained analytically using simple combinatorial 
methods, whereas a more accurate estimate is given by means of the replica symmetric cavity method. These methods 
allow to compute the entropy of solutions (i.e. of maximal independent sets) as a function of the density of coverings 
p = M /N and are approximately correct in the large N limit. 

There are density regions in which no maximal independent set can be found, that correspond to the UNSAT 
regions of the phase diagram of the associated CSP. We study the structure of this phase diagram as a function 
of the average degree of the graph {K for random regular graphs and z for Erdos-Renyi random graphs). Since the 
replica symmetric (RS) equations (belief propagation) are not always exact, we study their stability in the full range of 
density values p and discuss the onset of replica symmetry breaking (RSB). This is expected, because both problems of 
finding minimum and maximum mISs are NP-hard. The two extreme density regions are not symmetric and different 
behaviors are observed; in particular while RS solutions are stable in the low density region they are unstable in the 
large density region. For random regular graphs with connectivity if = 3, general one-step replica symmetry breaking 
(IRSB) studies show a dynamical transition in the low density region and a condensation transition at high densities 
(accompanied by a dynamical one). In the latter case, the complexity remains zero in both the RS and IRSB phases. 

The theoretical results valid on ensembles of graphs, are tested on single realizations using different types of 
algorithms. Greedy algorithms converge very quickly to maximal independent sets of typical density, around the 
maximum of the entropic curve. By means of Monte Carlo methods, as well as message passing algorithms, we 
can explore a large part of the full range of possible density values. As expected, finding maximal-independent sets 
becomes hard in the low and high density regions, but each algorithm stops finding mISs at different density values. 

The paper is organized as follows: In the next section we recall some known mathematical results and discuss related 
works in computer science, economics, and physics; Section III is instead devoted to present some rigorous results 
on the spatial organization of maximal independent sets (in their lattice gas representation). We develop the cavity 
approach in Sec. IV, with both the RS solution and the discussion of the replica symmetry breaking in regular random 
graphs. From Sec. V we move our attention to the problem of developing algorithms to find maximal-independent sets 
of typical and non- typical size (i.e. density p). We present a detailed analysis of greedy algorithms, and put forward 
several different Monte Carlo algorithms, whose performances are compared with one of the best benchmarks used 
in CSP analysis, the Belief Propagation-guided decimation. Conclusions and possible developments of the present 
analysis with application to computer science and game theory are presented in Section VI. 

Some more technical results are reported in the Appendices. In Appendix A we compute upper bounds for the entropy 
of maximal independent sets in random regular graphs and Poissonian random graphs using a combinatorial approach 
in the annealed approximation (first and second moment methods); while Appendix B contains a proof of the results 
in Section [IIII In Appendix C we discuss the Survey Propagation and criticize its results. Appendix D is devoted to 
the important calculations for the RS and IRSB stability analysis. In Appendix E we shortly describe the Population 
Dynamics algorithm that is used to solve some equations in the paper. In Appendix F we analyze separately the two 
constraints defining a mIS, i.e. neighborhood covering and hard-core particle packing problems that approximate (from 
above) the correct entropy in the low and high density regions respectively. Finally, in Appendix G we briefiy consider 
the problem of enumerating mISs of higher order, that is also relevant to study stable specialized Nash equilibria of 
some network games Q. 

II. RELATED WORK AND KNOWN RESULTS 

This section is devoted to collect and briefly summarize a large quantity of results and works directly or indirectly 
related to maximal independent sets in graphs. For simplicity we have divided them depending on their fields of 
application. 







FIG. 1: (Color online) Example of 4 maximal-independent sets (mIS) out of the 11 possible ones for this graph with 9 vertices. 
Vertices labeled with 1 belong to the mIS. 



A. Computer Science and Graph Theory 



A first upper bound for tlie total number of maximal independent sets was obtained by Moon and Moser (1965) 
showing that any graph with N vertices has at most 3^/"^ mISs ^]. The disjoint union of iV/3 triangle graphs is the 
example of a graph with exactly 3^/'^. On a triangle-free graph, the number of mISs is instead bounded by 2^/^ 

In case we are interested to the number of mISs of a fixed size M, tighter bounds are available. In relation to 
coloring, it was shown by Byskov 11] that the number of mISs of size M in any graph of size N is at most 
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while the number of mISs of size < M is at most ^^m-n ^n-sm These upper bounds can be compared with those 
presented here and obtained using non-rigorous methods (see Tables W HU^ . 

Despite these mathematical results, counting maximal independent sets is mainly a computational problem. In 
particular, listing all mISs of a given graph and retaining the ones of maximum and minimum size allows to solve 
the maximum independent set and minimum independent dominating set problems, that are NP-hard. Again, a list 
of all mISs can be exploited to find 3-coloring of a graph and to solve other difficult problems fisj . For this reason, 
researchers have studied algorithms to list all maximal independent sets efficiently in polynomial time per output 
i.e. polynomial delay between two consecutively produced mISs 



set, i.e. polynomial delay between two consecutively produced mISs [14lj . Even faster is the celebrated randomized 
distributed algorithm proposed by Luby, and based on message-passing technique, that works in O(logiV) time [4]. 
The main problem is that the number of mISs is in general exponential with the number of vertices N, therefore the 
total time required to list all maximal independent sets is still too large for computational purposes. 

Another important result is the inapproximability within any constant or polynomial factor of the associated opti- 
mization problem. Indeed, a branch of computer science is interested in proving the existence of algorithms that find 
in polynomial time suboptimal solutions to optimization problems that are NP-hard. These algorithms can be proved 
to be optimal up to a small constant or a polynomial factor (see e.g. [IBl)- When this can be done, the optimization 
problem is approximable. For the minimum vertex cover it is possible to find a cover that is at most twice as large as 
the optimal one, therefore MVC is approximable with constant factor 2. On the contrary the maximum mIS problem 
(i.e. finding a MVC configuration but maintaining the "maximality" condition at every step) is not approximable 
within any constant or polynomial factor p^ . This result shows that maximality condition is not a minor detail and 
actually has a profound impact on the properties of the computational problem. 



B. Economics and Game theory 



The lattice gas configurations associated with maximal independent sets are Nash Equilibria of a discrete network 
game, called Best Shot Game, recently studied by Galeotti et al. Q and previously introduced for the case of complete 
networks by Hirshleifer [l3|- The Best-Shot game is a toy-model for the type of strategic behaviors that emerges in 
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many social and economic scenarios ranging from information collection to public-goods, i.e. wherever agents are 
allowed to free ride and exploit the actions of their peers. 

A more general game theoretic framework for the allocation of public goods on a network structure was proposed 
by Bramoulle and Kranton |6l| . In their game, agents are located on the nodes of a network and have to decide about 
the investment of resources for the allocation of some public goods. An agent can purchase the good for a fixed cost 
c, cooperate with neighboring agents sharing a lower individual investment (i.e. < c), or free ride possibly exploiting 
a neighbor's investment. Two classes of equilibria exist: specialized equilibria, in which agents either pay all cost c or 
free ride, and mixed equilibria where cooperation is present. However only specialized equilibria are stable and for this 
reason we can limit the analysis to the discrete case with only two actions: action 1 (full cost investment) and action 
(free-riding) Q- In the Best Shot game agents possible actions are restricted to these two options. Independent sets 
arise because agents receive positive externalities, i.e. they prefer not to contribute if at least one of their neighbors 
already does. The independent sets are maximal because the contribution is a dominant action, i.e. one is better 
off by contributing if none of the neighbors does. The set of Nash equilibria of the Best Shot game on a graph Q is 
exactly the set of maximal-independent sets of Q. 

In the case of public goods the objective function of a Social Planner is to find optimal Nash equilibria, that are 
Nash equilibria maximizing the sum of individual utilities by minimizing the global cost. We have seen that finding 
MIS and mis is an NP-hard optimization problem, therefore we expect that self-organizing towards such optimal 
equilibria is an equally difficult task for a network of economic agents. It is thus of great importance to develop simple 
mechanisms to trigger optimization and study how such mechanisms could be implemented in realistic situations for 
instance by means of economic incentives [l8| . 

It is worth no ting that recently some techniques from statistical physics have been applied to the study of the 
Best Shot game [l9j The work by Lopez-Pintado [l^, indeed, put forward a mean-field theoretical analysis of the 
best- response dynamics that provides an estimate of the average density of contributors (action 1) in Nash Equilibria 
on general uncorrelated random graphs. In Section |Vl we will discuss the relation between best-response dynamics 
and another algorithm (the Gazmuri's algorithm) that can be used to find Nash Equilibria (i.e. mIS). 



C. Physics 

The hardcore lattice gas representation allows to map maximal independent sets on the solutions of a constraint 
satisfaction problem. As mentioned in the introduction, similar CSPs have been recently studied in the statistical 
mechanics community to model systems with geometrical or kinetic constraints, and exhibiting a glass transition. 

Kinetically constrained models [20,] are used as simple, often solvable, examples of the glass transitions. For instance, 
the Kob- Andersen model [2l| is a lattice gas with a fixed number of particles in which a particle is mobile if the number 
of occupied neighbors is lower than a given threshold, mimicking the cage effect observed in super-cooled liquids. At 
sufficiently large density of particles, the system can be trapped in some blocked configuration, in which all particles 
are forbidden to move. Though the model is dynamical and satisfies constraints of a kinetic nature, the number 
of blocked states depending on the parameters of the system can be studied with a purely static, thermodynamic 
approach (a la Edwards) [23|. These calculations, performed using the transfer matrix methods, the replica method 
and numerically by thermodynamic integration, have been applied to several models exhibiting dynamical arrest, such 
as the Fredrickson- Andersen model [23], the zero-temperature Kawasaki exchange dynamics |24l. [2511 or urn models 
[26j , and recently extended to the study of Nash Equilibria of the Schelling's model of segregation [23] . 

Studying the dynamical arrest only by means of thermodynamic methods leads to underestimate the role of the 
dynamical basins of attraction of different blocked configurations (Edwards measure vs. dynamical measure). On 
the contrary, these methods become exact or approximately correct in problems with constraints due to geometric 



frustration, like hard-core lattice gas models [28[ and hard-sphere packing problems [29[ . As mentioned in the Intro- 
duction, the simplest hard-core lattice gas model is the dual of the vertex covering where the close-packing limit (high 
density regime) corresponds exactly to the minimum vertex cover problem (i.e. maximum independent set problem). 
On random graphs with average degree z > e, the vertex covering problem presents frustration and long-range cor- 
relations, that are responsible of replica symmetry breaking [s^]- Moreover, the numerical detection of hierarchical 
clustering in the solution's landscape suggests the existence of levels of RSB higher than IRSB 'sT]. 

A first attempt to model a purely thermodynamically driven glass transition in a particle system was propose d by 
Biroli and Mezard (BM), that studied a lattice glass model on regular lattices [s^ and random regular graphs [33j ] . 
in which configurations violating some locally defined density constraints are forbidden. More precisely, a particle 
cannot have more than £ among its k neighbors occupied. At zero temperature, the constraints become hard and 
the model is effectively a CSP defined on a general graph. The BM model is very similar to our problem as shown 
by mapping back particles onto uncovered vertices and vacancies onto covered ones. In a configuration defining a 
maximal independent set, every uncovered node has at most fc — 1 uncovered neighbors, because at least one of them 
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has to be covered. Therefore, our model is similar to a BM model with £ = k — 1. However, a further constraint on 
covered nodes is present, requiring vacancies to be completely surrounded by particles. Maximal independent sets are 
thus similar to lattice glass configurations in the close packing regime, but the existence of the packing-like constraint 
prevents the statistical mechanics of the two models to be exactly the same. A generalization of BM model with 
attractive short rang interactions has been studied in Ref. ^34 ] . 




Finally, a recent paper by Tarzia and Mezard [35| puts forward a model of Hyper Vertex Covering (HVC) that is an 
abstraction of the Group Testing procedures. The model can be defined on a random regular bipartite factor graph 
with N variables and M constraints and consists in finding a cover of the variables (that are 1 or if covered or 
uncovered respectively) subject to the condition that the sum of the variables involved in each constraint is at least 
1. This hypervertex cover constraint is somewhat similar to the maximal-independent set constraint that requires at 
least one of the nodes in the set composed by a node and its neighborhood to be 1. On the other hand, maximal- 
independent sets are defined on real graphs instead of factor graphs and there is a further packing-like constraint (no 
neighboring Is are allowed). 

In conclusion, similar problems are current matter of investigation in the statistical physics community [sgI ] , but 
in our opinion the mIS problem is different from all them and substantially new due to the presence of two local 
constraints with contrasting effects. 



A preliminary account of the statistical properties of maximal independent sets can be obtained from simple but 
rigorous mathematical analyses providing i) lower and upper bounds for the existence of mISs with density p of covered 
nodes, and ii) information on their spatial arrangement in the space of all possible binary configurations on a graph. 

We have anticipated in the previous sections that in general it is not possible to find maximal independent sets 
at any given density p of covered nodes. This is due to the confiicting presence of covering-like constraints (favoring 
higher densities) and packing-like constraints (favoring lower densities). We thus expect to find mISs only within a 
finite range of density values [pmin, Pmax]- On random graphs, lower bounds p^fn^ for the minimum density value 
and upper bounds p^ax" fo'^ the maximum density value can be easily obtained by means of annealed calculations 
employing the first moment method (see Appendix|Xl. These values are reported in Table|T]for the case of Erdos-Renyi 
random graphs and in Table HI] for random regular graphs (RRG), i.e. graphs in which connections are established in 
a completely random way with the only constraint that all nodes have the same finite degree K . 

In addition to the results of these combinatorial methods, very general results on the structure and organization 
of mISs can be obtained rigorously starting from the lattice gas representation. (We refer to the Appendix [B] for 
the proofs of all propositions contained in the present section.) An independent set / is a configuration with binary 
variables cr,; e {0, 1}, in which ai = 1 if the node i belongs to the independent set (i £ /) and Ui = if the node is not 
part of it (i G /^). We are interested only in those independent sets that are maximal and we focus on the behavior 
of single variables as well as sets of variables. We first introduce some useful concepts. 

Consider a set S of solutions of the mIS problem, that is a set of configurations a;, with the property of being a mIS 
for a given graph Q. A variable ai is frozen in the set S if it is assigned the same value in all configurations belonging 
to S. Therefore, by hipping this variable we cannot obtain a configuration with the property of being in the same 
set S. For matter of convenience we classify these kind of variables in a hierarchical way. The variable ai is locally 
frozen if we can obtain a configuration belonging to S by flipping Ui and at most a number n of other variables aj 
with strictly n — o{N). The variable ai is instead globally frozen if, in order to have another configuration in S, we 
have to flip ai and 0{N) other variables. 

In the following, we consider S to be the set of all maximal independent sets in a given graph Q . It is straightforward 
to show that (see App. |B|) 

Proposition 1 If a configuration is a maximal independent set, then all variables ai \/i = 1, . . . N are at least locally 
frozen. 

Proposition [1] shows that the Hamming distance between two mIS configurations is at least 2, but does not specify 
if this is the actual minimum in every graph and, more importantly, which is the maximum possible distance between 
two mISs. A result in this direction is obtained by considering the propagation of variable rearrangements induced by 
a single variable fiip. Suppose that at step t = a. variable ai is hipped from to 1, then some of the local constraints 
on the neighboring variables become unsatisfied and these variables have to be flipped too. At step t = 1, we flip all 
variables giving contradictions and identify all other constraints that now become unsatisfied. We proceed in this way 
until all variables satisfy their local constraints and the configuration is again in S. This iterative operation is usually 
called best-response dynamics in game theory Q. In many CSPs, a similar iteration would not converge rapidly to a 




III. MAXIMAL INDEPENDENT SETS: EXISTENCE AND ORGANIZATION 
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A) 



B) 




FIG. 2: (Color online) Examples of rearrangements induced in a mIS configuration by a single spin flip: A) from I (black) to 
(white), B) from to 1. 

new solution due to the presence of loop-induced frustration and long-range correlations. In such cases, variables are 
globally frozen, because their value depends on the value assumed by an extensive number of other variables. On the 
contrary, in the case of maximal independent sets, all rearrangements involve a finite number of variables. 

Proposition 2 If a variable at of a mIS is flipped and the variables (including Ui itself) are updated just once by 
best-response dynamics, the outcoming configuration is still a maximal independent set. Moreover: A) If a variable 
CTj of a mis is flipped from 1 to Q the best-response dynamics is limited to the neighborhood of the node i. B) If 
instead the variable ai is flipped from to 1, the best-response dynamics extends at most to the second neighborhood ofi. 

An example of the validity of Prop. [2] is reported in Fig. [2j In A) we flip the top vertex from 1 (black) to (white) 
and rearrange the rest of the mIS configuration. The right neighbor is forced to flip — )• 1, but her neighbors are 
thus they do not flip. The rearrangement propagates up to the neighbors of the top node. In B) the top node is 
flipped from to 1. The left neighbor flips to 0, thus leaving the neighboring node unsatisfied. This node fiips to 1, 
but all her neighbors are already satisfied at 0. The rearrangement propagates here up to the second neighborhood. 

The number of variables flipped during the best-response dynamics depends on the topological properties of the 
underlying graph. When the flrst and the second moments of the degree distribution are finite. Prop. [5] implies that 
all variables are only locally frozen, and starting from a mIS it is always possible to find another mIS within a finite 
number of spin flips. This is not always true for scale-free networks with power-law degree distribution pk oc k~'' and 
7 < 3. In these networks, the second moment of pk diverges with the system's size N, therefore a single variable flip 
can induce the rearrangement by best-response of an extensive number of other variables. An example is provided by 
star-like graphs in which the only two possible maximal independent sets are the one with the central node covered 
and all other nodes uncovered and the complementary configuration. It is easy to see that the best-response dynamics 
following a single fiip always extends to the whole graph and implies the rearrangements of 0{N) variables, that are 
globally frozen. 

Let us exclude extremely heterogeneous star-like structures, by focusing on homogeneous graphs with bounded 
degrees, i.e. Vi ki < const <^ N. Even in case of networks in which maximal independent sets are locally frozen, these 
mis are all connected by best response dynamics, as proven by the following result. 

Proposition 3 Given a pair of maximal independent sets I and I' , it is always possible to go from I to X' with a 
finite sequence of operations, each one consisting in flipping a single variable from to I followed by a best-response 
rearrangement. 

This statement proves exactly the connectedness of the space of maximal independent sets under an operation that 
allows to move on that space. In the thermodynamic limit, when the distance between two mISs can be taken of 
0{N), there is a chain of at most 0{N) operations to go from one mIS to the other, and each intermediate step 
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z 


Pmin 


BP 
Pmin 


BP 
Pmax 


upper 
Pmax 


3 


0.253 


0.301 


0.561 


0.631 


4 


0.216 


0.244 


0.504 


0.564 


5 


0.190 


0.208 


0.461 


0.511 


6 


0.170 


0.183 


0.425 


0.468 


7 


0.155 


0.165 


0.396 


0.432 


8 


0.143 


0.150 


0.370 


0.403 


9 


0.132 


0.138 


0.350 


0.377 


10 


0.124 


0.129 


0.331 


0.355 



TABLE I: Erdos-Renyi Graphs of average degree z: Comparing lower and upper bounds p^^n'^ , P^ET obtained with the first 
moment method, with the values that Behef Propagation (BP, see Section V) equations predict Pmf^, P^ax- 



corresponds to a mlS that differs from the previous and the following ones in just a finite number of variables (due 
to a finite rearrangement). 

In summary, we have uncovered two important properties of the space of mISs: 1) all variables in a mIS are locally 
and not globally frozen, i.e. the minimum Hamming distance between mISs is at least 2; 2) the space of mIS is 
connected if we move between mIS using a well-defined sequence of operations. It means that all mISs form a single 
"coarse-grained" cluster when the appropriate scale is chosen, that is that of best response rearrangement operation. 
How does this picture change when we take into account mISs at fixed density p of occupied nodes? It is not clear, 
because in the rearrangement that follows each single variable's flip, the number of covered nodes changes of a finite 
amount, causing only a negligible variation in the density p. On the other hand, the length of the sequence of these 
operations can be of 0{N), therefore two mISs at the same, very low or very high, density could be connected by 
a very long path that gradually overpasses some density barrier. Since the proof of connectedness cannot be easily 
extended to the space of mISs at fixed density p, we cannot exclude the onset of some clustering phenomena at very 
low or very high values of p. In the next section we will try to address this question by means of statistical mechanics 
methods. 



IV. PHASE DIAGRAM BY THE CAVITY METHOD 

Some of the information obtained in the previous section are now compared with the results of a statistical mechanics 
analysis [H, |39| by means of the zero-temperature cavity method [1^. At the level of graph ensembles, there is a 
sharp difference between random regular graphs and ER random graphs. In random regular graphs, all nodes behave 
in the same way, thus the cavity equations simplify considerably leading to simple recursion equations for a small set 
of probability marginals. The case of ER random graphs is more involved as the presence of various degree values 
implies the use of distributions instead of simple cavity fields already at the replica-symmetric (RS) level. In both 
classes the solutions of the replica symmetric cavity equations are not always stable (at least for some values of the 
average degree). For random regular graphs we consider the 1-step replica symmetry breaking (IRSB) scenario to see 
how glassy phases (if any) change the RS picture close to minimum and maximum densities. 

However, cavity equations can be applied to study single instances as well (i^, leading to message-passing recursive 
algorithms able to find efficiently maximal independent sets in a wide range of density values on very general kinds 
of graphs (see Section |VC|) . 

A. Replica Symmetric results: belief propagation 

The problem of finding a maximal independent set of density p can be mapped on the problem of finding the ground 
state of a particular kind of spin model or lattice gas on the same graph. On each vertex i we define a binary variable 
Ci — {0, 1}. For a configuration a_ = {ai\i = 1, . . . ,iV} to be in the ground state (i.e. a mIS), each variable i has to 
satisfy a set of fci -I- 1 constraints involving neighboring variables. There are ki constraints hj on the edges emerging 
from i, each one involving two neighboring variables i and j, and one further constraint Ii on the whole neighborhood 
of i. For two neighboring nodes i and j, the edge constraint /.y = 1 iff = V cTj =0 {packing-like constraint)] 
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K 


lower 
Pmin 


BP 
Pmin 


„BP 


BP 


BP 
Pmax 


upper 
Pmax 


3 


0.233 


0.264 


- 


0.425 


0.458 


0.649 


4 


0.20 


0.223 


- 


0.381 


0.419 


0.578 


5 


0.177 


0.196 


- 


0.349 


0.387 


0.522 


6 


0.160 


0.175 


- 


0.324 


0.360 


0.476 


7 


0.147 


0.159 


0.166 


0.303 


0.338 


0.439 


8 


0.135 


0.146 


0.157 


0.285 


0.319 


0.408 


9 


0.126 


0.136 


0.150 


0.272 


0.301 


0.382 


10 


0.118 


0.127 


0.144 


0.258 


0.287 


0.359 



TABLE II: Random Regular Graphs of degree K: Comparing lower and upper bounds pmi^"^, p^ix^ , obtained with the first 
moment method, with the values that Belief Propagation (BP, see Section V) equations predict Pmiri, Pmax- For p < pf-^ we 
need damping to converge BP equations while for p > pf,^ BP equations do not converge even with damping. 



while the neighborhood constraint li = 1 iff ct,; + X^jeOi^'j > (covering-like constraint). Here di represents the set 
of neighbors of node i. 

The zero temperature partition function corresponding to this constraint-satisfaction problem reads 

^(^) = En^'('^-'^e,) n (2) 

in which agi = {aj \j G di} and /it is a chemical potential governing the number of occupied vertices. Assuming that 
the graph is a tree, we can write exact equations for the probability of having configuration (ci, {crk\k € di \ j}), 
on node i and its neighbors except for j, when constraints Ij, li and lij are absent (cavity graph). We denote this 
probability Vi^j {fJi , <Ji-^j ) and write 

uk^, kedi\j 

where cFi^i = {(Tfc|fc £ di \j} and Zi^j is a normalization constant. The equations, called Belief Propagation (BP) 
equations [4l|, are exact on tree graphs, but can be used on general graphs to find an estimate of the above marginals. 
They give a good approximation if the graph is locally tree-like, i.e. there are only large loops whose length diverges 
with the system's size, as for random graphs of finite average degree. Beside this, in writing the BP equations we 
assume that only one Gibbs state describes the system, i.e. we have Replica Symmetry. 

For the present problem, the BP equations simplify considerably if we write them in terms of variables R]j^m = 
Vi^j{cri,m) in which m is the number of occupied neighboring nodes of i (in the cavity graph). Looking at the BP 
equations one realizes that a configuration satisfying all constraints (i.e. solution of the BP equations) contains only 
three kinds of these variables: 1) the probability that a node is occupied in the cavity graph and all its neighbors are 
empty r[^-' — R\^'' , 2) the probability that a node is empty as well as all its neighbors r}^-' — -Rq^'', and 3) the 

probabihty that a node is empty but not all neighbors are empty r^^'' — X]m=i ^o~m- terms of these variables, 
the RS cavity equations become 

rp^" cx e-^ n (l-^i'"')' (4) 

kedi\j 

rV' « n (l-'-oV^)- n ^0^' 

kGdi\j kGdi\j 

rlt' « n ''S^^ 

kedi\j 

With the correct normalization factor, these equations can be solved by iteration on a given graph. 
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FIG. 3: (Color online) BP entropy s{p) of maximal independent sets for random regular graphs with different values of 
K — 2,3,4,5 (curves from right to left). 



1. Random Regular Graphs 

In random regular graphs with degree K, the equations do not depend on the edge index i ^ j ai their fixed point, 
and we have 

roo 

The zero temperature partition function counts the number of ground states of the system, i.e. the number of mIS, 
weighting each occupied vertex with a factor e~'^. The corresponding free energy is defined as 

^~^Nf{^) ^z= j dpe^^'^P^-'^^P (6) 





e-''(l - 










+ (1 - roo 




(1 




'o 




e-*'(l - 


'o 


+ (1 - roo 

-1 


K-l 


e-^(l - 




+ (1 - roo 


K-l 



in which we have decomposed the sum over mIS configurations in surfaces at the same density of occupied sites 
p, isolating the entropic contribution at each density value s{p). The knowledge of the free energy /(^) allows to 
compute by Legendre transform the behavior s{p) of the entropy of mIS as a function of the density of occupied 
vertices, that can be compared with the results obtained by means of the annealed calculation (Appendix In the 
Bethe approximation, the free energy can be computed as 



N " N 

i (i,i)e£ 



where 



e 



e 



E h^Vj^,{Gj,Gj^,i) (8) 

fi.fj ,0"ai\j ■'^Oj\i 

and in terms of the variables {ri, vq, roo} 

e-MA/. ^ + „^if (10) 

g-M/., ^ r2 + 2ri(ro+roo) (11) 
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FIG. 4: (Color online) RS entropy s{p) for random regular graphs of various degree K — 3 ~ 10 (curves from right to left) and 
size A'' — 10'* . The curves are plotted only in the intervals of density in which BP equations converge. 



Moreover in these variables the density is easily written as 

e-^(l -ri)^ 



-P(l-ri)^^ + (l-roo^^ 



(12) 



Once we have solved the BP equations ([S]), we have p and /(/^) and the Legendre transform fJ.f{fJ.) — 
— maxp [s{p) — fip], so we can compute the entropy s(p) by inverting it. 

We have solved the BP equations numerically and plotted the corresponding s vs. p diagrams in Fig|3]for several 
values of the degree K — 2, 3, 4, 5. 

At this point one should check the stability of BP equations at the fixed point. This stability is important to 
have convergence and find reliable values for the marginals. Notice that BP stability is not a sufficient condition 
for the problem to be in the RS phase, but a necessary one, i.e. if BP are unstable, then we can conclude that RS 
assumption is not correct anymore [ssj . In general, two kinds of instabilities can occur: a ferromagnetic instability 
(or modulation instability), associated with the divergence of the linear susceptibility and signaling the presence of 
a continuous transition toward an ordered state; and a spin-glass instability, associated with the divergence of the 
spin-glass susceptibility and signaling the existence of RSB and possibly a continuous spin-glass transition. In the 
present case, since we are dealing with models defined on random graphs, the first kind of instability is excluded and 
we focus on the latter one. 

The details of stability analysis are reported in Appendix [Dl Called Am the maximum eigenvalue of the stability 
matrix M, the stability condition is A = {K — 1)A|^ < 1. In random regular graphs this happens as long a.s p < p^^ ■ 
For larger densities the BP equations do not converge even if we use a linear combination of old and new messages to 
stabilize the dynamics (like in learning processes). On the other side, for p < p^_^ we have A < 1 but for K > Km = 6 
the algorithm converges only if we stabilize it. The numerical values for different K are given in Table HTl 

In Fig. [3]we plotted again the curves of the RS entropy as function of p for various degree values, now showing only 
the region of the curve in which the RS solutions are stable. It is worthy noting that for K < Km = 6 BP equations 
converge very easily in the low density region while they are always unstable for K > Km = 2 in the high density 
region. 



2. Erdos-Renyi Random Graphs 

Consider the ensemble of ER random graphs with degree distribution pk and average degree z. In this case the 
probabilities r = {ri, ro, roo} depend on edge index (i — )■ j): 



11 



0.25 



0.2 



0.15 



0.1 



0.05 



z=3 
z=4 
z=5 
z=6 
z=7 
z=8 
z=9 
z=10 



0.1 



0.2 




0.3 
P 



0.4 0.5 



0.6 



FIG. 5: (Color online) Entropy of random ER graphs of average degree z = 3 — 10 (curves from right to left) and size A'' = lO"* 
in the range of densities in which BP equations converge. 



rp^'cxe-^ n (l-r-r^), (13) 

n (1-4"^)- n 



^0 



rot'^ n -o^^ 
keai\j 

One can run the above equations on random graphs to obtain the BP entropy. Figure [5] displays the results that we 
obtain in this way. There are some regions in which the equations do not converge even if we use a linear combination 
of new and old messages to stabilize the equations. 

Let us denote the above BP equations by BV. In a general random graph messages r change from one directed 
edge (i — > j) to another. In a large graph (or equivalently in the ensemble of random graphs) the statistics of these 
fluctuations is described by P{r) which satisfles 

PiD = E / n dP{r')6{r - BV), (14) 
fe •' 1=1 

where qk = {k + l)pk+i/z is the excess degree distribution. 

To obtain the entropy, we have to solve the above equation by means of population dynamics (see App. [E|. This 
is a well known method to solve equations involving probability distributions |40|. Having P{r) we compute the free 
energy in the Bethe approximations fif ~ ^{/S.fi) — -^/i(A/ij), with 

k 1=1 \ I I I I 

-/x(A/,,) = / dP{r_)dP{r:) In [r^r'^ + ri(r^ + r'^^) + r[{T^ + roo)) , 



(15) 



and finally invert the Legendre transform. The density of covered nodes is given by 



= I WdPir') -k ^''^^-f-'^ 

r J l\ e-A'ntr(i-ri)+nti(i--^o)-nLi-[, 
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FIG. 6: (Color online) Entropy of random ER graphs of average degree 2 = 3 — 10 (curves from right to left) obtained with 
population dynamics. 



Notice that in this way we obtain the entropy in the RS approximation for infinite random graphs. The numerical 
values are already in good agreement with those obtained on single samples of size N — 10'*. It should be mentioned 
that in the population dynamics algorithm we do not have the convergence problem, we could always stabilize the 
dynamics and find the entropy in the whole region of densities. Figure [5] displays the results of population dynamics 
for the entropy. The lower and upper bounds in Table U have been obtained with population dynamics. 



B. Replica Symmetry Breaking 



Here we go beyond the RS ansatz and explore the possibility that maximal independent sets organize in clusters or 
in even more complex structures [38| . The instability of BP equations that we observed in the previous section could 
be a clue of a transition into a IRSB or full RSB (fRSB) phase. This would be reasonable at least for the high density 
region because the maximum mIS (MIS) coincides with the minimum vertex covering for the conjugate problem, that 
is presumed to present RSB of order higher than 1 (possibly fRSB) (sol. Isil. l42j . 

In order to verify these ideas we consider IRSB solutions of the mIS problem. 

Having a IRSB phase means that the solution space is composed of well separated (distances of order N) clusters 
of solutions. Each cluster has its own free energy density fc and there are an exponential number of clusters of given 
free energy e^^^^'' where S(/) is called complexity. The physics of this phase is described by the following generalized 
partition function: 



Here m is the Parisi parameter that describes IRSB phase. Having the distribution of cavity fields V{r^^^) among 
the clusters we write the following expression for ■!> in the Bethe approximation ^45i] : 

y7V$ = ^yA$,;- ^ yA$,„ (17) 

where y = mfj, and 

e-«^*' = J Y[ dP{r^^')e-y^^% (18) 
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FIG. 7: (Color online) Comparing RS entropy (dashed line) and E(m — 1) (circles) in the low density region. 



The cavity fields distribution satisfies 



V{r'^^)(x J Yl dV{r^^')e-y^f>'^^5{r'^i -BV). 



(19) 



where we have assumed that the graph is regular and so ^'(r'^^ ) does not depend on the edge label. We have also 
introduced the cavity free energy change 



g-M/.^. = e-^(l - ri)^-i + (1 - roo)^-^ 
From the generalized free energy $ we can obtain the complexity by a Legendre transform 



(20) 



= -y* + 2//, 



/ = 



9y$ 
dy ■ 



(21) 



A simplifying approach would be that of working in the limit of Survey Propagation (SP) [43], assuming infinite 
chemical potential fi — >■ ±oo and zero Parisi parameter m — >■ 0, with finite ratio y = m/z. This means that we focus 
only on the most numerous clusters (m — 0) composed of frozen solutions (/i ±cxd). However, this is not consistent 
with the spatial organization of mISs emerging from Section IlIII Indeed, we know that variables are locally but not 
globally frozen, and the absence of globally frozen variables means that if there is any cluster of solutions it contains 
only unfrozen variables (46j . Therefore we do not expect to find any physically relevant result by means of Survey 
Propagation. In Appendix [Cl we report a detailed analysis showing that SP complexity is indeed unphysical. 

Let us relax the m = restriction to see if there is any other type of clusters. Notice that when m is finite the 
chemical potential can also be finite and can not have frozen variables in the clusters (i.e. variables assuming always 
the same value for all solutions in one cluster) . A nonzero complexity in this case would imply the presence of unfrozen 
clusters. 

For a value of m the relevant clusters are those that maximize S(/) — mfj,f. The parameter m itself is chosen to 
maximize the free energy $(m). As long as we are in the RS phase, m* — 1 clusters are the thermodynamically 
relevant ones with zero complexity. A dynamical transition could occur if the complexity of these clusters S](m = 1) 
takes a nonzero value. But a real thermodynamic transition (IRSB) occurs when the relevant clusters have again a 
zero complexity with m* < 1. 

Here we resort again to population dynamics (see App. |E]) to solve Eq. IIV Bl and to find the complexity as described 
above. For simplicity we only consider the case of random regular graphs with K — 3. We have used a population 
of size Np — 10^ to represent the distributions. To get the nontrivial fixed point of the dynamics we start from 
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FIG. 8: (Color online) Comparing RS entropy (solid line), IRSB entropy (triangles) and E(m — 1) (circles) in the high density 
region. 

completely polarized messages [13] and update the population for T = 10^ iterations to reach the equilibrium. In each 
iteration all members of the population are updated in a random sequential way. After this equilibration stage we 
take M = 10^ independent samples of the population to compute the free energy 4> and other interesting quantities 
like the complexity. 

In Figs. [7] and |8] we report the m = 1 complexity obtained with population dynamics and compare it with the RS 
entropy in the extreme density regions. 

There is a small interval in the low density region in which S](m = 1) is positive, signaling a dynamical transition in 
that region. Since the RS solution is stable in this region, total entropy will be equal to the RS one and the minimum 
density is the point that m — 1 complexity vanishes. Despite the large equilibration time we have still large statistical 
errors maybe due to the instability of this solution. We observed that the errors improve very slowly by increasing 
size of population or number of samplings, indicating a poor convergence of this solution. 

In the high density region we observe a completely different behavior where S(m = 1) is always non-positive. This 
behaviour is also observed in other problems like 3-SAT [48|, and it means that in the high density region dynamical 
and condensation transitions coincide [i^ . Below the transition, complexity is zero and we are in the RS phase with 
m* = 1; for larger densities, E(m = 1) becomes negative and we have a condensation transition where m* < 1. The 
entropy computed at this value of m gives the IRSB prediction of the entropy. The maximum density that we obtain 
in this way is displayed in Fig. [8l We recall that a maximum mIS is complement of a minimum vertex covering and 
we already know that these coverings have a nonzero entropy j30l , |4^ . 

We found qualitatively the same behavior for random regular graphs of degree K — A,h. 

C. Distance from a solution 

In Section Hill we have proved that in a general graph of size N a solution (mIS) is at a finite distance 0{{k'^)) 
from a number of 0{N) other solutions (mIS). The mathematical proof is based on the idea that by flipping a single 
variable in a mIS configuration we generate a rearrangement process that propagates at most to the second neighbors 
of the flipped variable. Here we substantiate this result by means of a statistical mechanics calculation. 

In the large deviations cavity formalism, it is possible to compute the number of solutions of a CSP at a distance 
d from a given one a* using the weight enumerator function (45l . [sol . [5l| 

Z = e-^^^" =^n^' U /.,e-^^.(^'---)' = /'e^[^'('^)-^'^l. (22) 
2 » {ij)ee ■''^ 
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FIG. 9: (Color online) Comparing analytic and numerical results for s(d) in random regular graphs of degree K = and 
size A*' = 10^. The points have been obtained by computing s(d) for a given solution using the cavity method. Max density 
(squares) and Min density (triangles) refer to the best extreme solutions that we obtain. 



where s(d) is the entropy of solutions at a distance d — jj J^ii'^i ^ '^lY from ct*. Averaging over all solutions (nilS) 
Z = e-^^^ = — Vyn^« TT j^^.e--E.(-.--r)^-ME.< ^ /e^I^W-'^l, (23) 

Zn — ^ — ^ J. J. J. i / 

2: 2. i {ij)e£ " 

where we added a chemical potential /i to keep track of the contribution of mISs with different density. Notice that 
here we are using the annealed approximation which works if there are no strong fluctuations in Z . We find / in the 
Bethe approximation in terms of cavity fields (now weighted with the term e~^('^»^'^») ) and extract the expression 
s{d) of the entropy of solutions at an Hamming distance dN from another by means of Legendre transform, 

df 

— xf ^ ma,x[s{d) — xd], d = f + x—. (24) 

For the mIS problem, the cavity equations have now to take into account the current value of the variable tr; and that 
in the reference configuration a* , 

l/,^j((T„(T,^j>*) OCe-"("*-"-)'-'^"* ^ II IkI^kl^k^^{ak,CJk-,^K) (25) 

(^k^i k^d'i\j 

Knowing / and d we can numerically compute s{d), as reported in Fig[9]for the case K ~ ?>. The plot shows that 
at typical densities there is always an extensive number of mISs at any finite distance d from another mIS. The same 
holds for non- typical values of the density (obtained with 7^ 0), even if in this case we can push our analyses only 
up to those values at which BP converges. From the figure we observe that a low density solution is closer to other 
solutions than a high density one whereas typical solutions lie in between. 



V. NUMERICAL SIMULATIONS 



In this section we discuss different classes of numerical simulations that can be used to investigate the properties of 
maximal independent sets on random graphs. We first consider some greedy algorithms, that generate a mIS in a time 
that scales linearly with the system's size. These algorithms work in a very limited range of density values, though 
they are of interest for their application to the study of the best-response dynamics in strategic network games d, 0] . 
A more effective way to explore non typical regions of the phase diagram, at low and high densities, is by means of 
Monte Carlo simulations based on the lattice gas representation. Monte Carlo methods are also useful to obtain an 
estimate of the entropy of maximal-independent sets by thermodynamic integration. Other interesting results can be 
obtained by means of a particular kind of zero-temperature Monte Carlo simulation, that allows transitions from a 
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FIG. 10: (Color online) Average density of coverings p (circles) vs. average degree K, z in the mISs obtained with Gazmuri's 
algorithm on ER random graphs (open symbols) and random regular graphs (full symbols). Data points were obtained averaging 
over 100 graphs oi N = 10^ nodes. The dashed lines are the theoretical predictions obtained solving the corresponding 
differential equations for the concentration of covered nodes. The typical behavior of greedy algorithms is compared with 
typical results of Monte Garlo simulations on the same graphs (diamonds). 



mis directly to another and have been conceived appositely for sampling the space of maximal-independent sets. 
Finally, we compare the results of Monte Carlo simulations with those obtained by a completely different technique, 
the numerical decimation method based on belief-propagation equations (BP decimation, BPD). Both Monte Carlo 
and BP decimation are effective in sampling mISs in the regions of low and large density values, but they are not able 
to reach the extreme density limits predicted by theoretical calculations (RS bounds). 

Note that the use of a simulated annealing scheme makes the computational time of all MC simulations much longer 
than that of BP-based algorithms. Therefore, whereas in the case of BP decimation we present results for systems of 
N = 10^ nodes, all MC results will be restricted to smaller size {N = 10'^) in order to have a reasonable statistics in 
particular at non-typical densities. 



A. Greedy algorithms and Best Response dynamics 

The simplest algorithm to generate maximal independent sets is the Gazmuri's algorithm (s^ . 

0. Start assuming all nodes of the graph to be 0; 

1. At each time step select a node i uniformly at random, assign 1 to the node i and remove it from the graph 
together with all its neighbors (that are valued) and all the edges departing from these nodes. 

2. Repeat point 1. until the graph is empty. 

The configuration of the removed nodes defines a maximal independent set for the original graph. The Gazmuri's 
algorithm works in linear time on every graph, but does not allow any control on the density of covered nodes; 
therefore one could expect to obtain on average solutions of typical density, close to the maximum of the entropic 
curve s{p) in Fig. 

ER random graphs are particularly simple to study theoretically. We assume that the degree distribution remains 
poissonian during nodes removal, but with a time-dependent average degree. This is reasonably correct if the random 
removal is an uncorrelated process. 

The dynamics of the Gazmuri's algorithm can be expressed in terms of differential equations for concentrations, using 
a standard mean-field approach recently formalized in probability theory by Wormald [53]. The initial number of 
nodes in the original graph is A'', but at each step of the process one node and all its neighbors are removed. If we 
call c(r) the average degree of a node in the graph after T temporal steps, the expected variation of the number 
of nodes in the graph is E[N{T + 1) - N{T)] = -1 - c{T). Writing N{T) = Nn{T/N), we get the concentration 
law fi{t) = —c{t) — 1. Note that the average degree evolves as c{t) — c{0)n{t) with c(0) = z. The two equations 
give c{0)n{t) — (c(0) + l)e^'^(")* — 1, that vanishes at tf = log (c(0) + l)/c(0). As we cover only one node per unit 
of time, p{tf) — tf. The Gazmuri's algorithm on ER random graphs of average degree z produces mIS of typical 
density pgaz = p{if) = log (z -I- 1)/^. In Figure [TU] we show that this result is in good agreement with simulations 
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done averaging over 100 ER graphs of size N = 10^ and corresponds approximately to the typical density, even if it 
systematically overestimates the values maximizing the RS entropy (see also Fig. [T^ . 

The probability of obtaining a non typical mIS with the Gazmuri's algorithm decays exponentially like e^^'' where 
dp is the deviation of the density of covered nodes with respect to the typical density pgaz- In principle, repeating 
an exponential number of times the Gazmuri's algorithm, we have non zero probability to find rare trajectories in 
which the final density of covered nodes may differ considerably from the typical one. These finite size effects can be 
quantified using the following path-integral approach 0, [s^ . 

The evolution of the algorithm is fully specified by the evolution of the concentration of covered nodes x{t), the 
concentration of the number of untouched nodes n{t) or equivalently the evolution of the average degree c{t). The 
probability that in the T + 1*'' temporal step of the algorithm the number of covered nodes and untouched nodes 
changes respectively of AX and AiV is 



P^+\AX,AN) = e-^(*) 



°° c(t)'' 

SaX.I ^ SAN,-k-l ^, 
k=0 



(26) 



where we have used the Poisson degree distribution p}^ = "^^P^ ^^^^ • -^^ Fourier space 

oo oo 

P^+\m,m} = E E PT^\AX,AN)e-''^^'^^''-'^^'^^'' (27) 

AX=-oo AN=-oo 

= exp [-c(i) + i^(<) - ie(i) + c(t)e*''(*) . (28) 
Then considering NAT consecutive steps and neglecting subleading terms in At we get 

P^+^^iAX, AN) ^ ^^^^.it)AN+^m^x 1^^^ ^ „ ^ c(Oe*''(*)) } (29) 

where one should then use AX ~ Nx{t)At and AiV ~ N c{t) At / c{0) . The probability P{xf\c) of a trajectory 
{x{t),c{t)} given the initial and final conditions {a;(0) — 0, c(0) = z} and {x{tf) Xf = tf,c{tf) = 0} is 

Pi^\c) ^li r ^ r ^ f dx{t) f dc{t) exp {^NAtC{x{t).m.<t),c{t), C(i), A^(t))} (30) 

with Lagrangian 



£(i, c, X, c, p) = c(t) -ip{t) + i^{t) ~ ce^^(*' - i^{t)x{t) - ip{t) - (31) 

z 

The Euler-Lagrange equations are 

a; = 1, 

i = 0, 

c = -zil + ce'") 
ip = z(e'^-l) (32) 

Solving the equations, the probability of rare events becomes P{x\c) ~ exp (iVX(x, c)) with the large deviation 
functional given by the saddle-point action I{x,c) = J^^ dtC{x,c). We have solved numerically the Eqs. [32] and 
computed the large deviation functional X(p, z) for ER graphs with different values of the average degree z. In Fig. 
miA we have plotted the behavior of the large deviation functional for z = 4 as a function of the density p of covered 
nodes in the mIS generated by the algorithm. Its theoretical behavior is compared to the results of simulations of 
Gazmuri's algorithm on a ER graph with N — 100, 200, 500, 1000 nodes and same average degree K = A. The results 
are in perfect agreement. 

In the case of random regular graphs, it is possible to obtain an analytical estimate of the behavior of the Gazmuri's 



algorithm using an approach based on random pairing processes [55|- A random pairing process is used to generate 
random graphs with a given degree distribution and consists in taking a number of copies of the same node equal 
to its degree and in matching these copies randomly with copies of other nodes until the network is formed and 
no copy remains unmatched. The evolution of mIS can be described using a random pairing process. We consider 
two quantities: the number of covered nodes (i.e. that is equal to the time T) and the number of nodes that still 
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FIG. 11: (Color online) Rare events in the Gazmuri algorithm for ER random graphs (left) and random regular graphs (right) 
with z,K = 4. The theoretical behavior of the large deviation functional T{p) (dashed line) is compared with results of 
simulations for graphs with A'' = 100 (circles), 200 (squares), 500 (triangles), 1000 (crosses) nodes. Data points are obtained 
computing the probability P(p) of observing mIS of density p out of 10^ trials. Plotting the rescaled function logP(p)/A*', we 
find perfect agreement with the theoretical values for I{p) . 



have not been touched by the algorithm, i.e. Y{T). All copies of the nodes are initially untouched: at each step, 
we select randomly an untouched copy and cover her together with all her siblings. The K other copies matched 
with these ones are removed {exposed in Ref. (sBj). The result is that at each step the number of untouched copies 
decreases of 2K, but some of the remaining copies are siblings of exposed ones. This is important in order to 
compute the total variation of the number of untouched nodes during the process. In fact, the probability that 
the pairing of a selected copy is also untouched is given by the number of untouched copy KY(T) divided by the 
total number of non-exposed copies KN — 2KT. This random pairing algorithm is repeated until there are no 
untouched nodes anymore. As we are considering the pairing on the fly, on an annealed network structure, we can 
neglect the evolution of the degree distribution and write an equation for Y(t). Its variation in a single time step 
is E[Y{T + 1) - Y{T)] = -1 - Y{t)/{N ~ 2T). The dynamics of y{t) = Y{T/N)/N is governed by the differential 
equation 



dv{t) ^ ^ yjt) 
dt l-2t 



(33) 



that gives y{t) 



(g-l)(l-2t)^/^-(l-2t) 
K-2 



The density of covered nodes is given by the time t/ at which y{tf) = 0, i.e. 



tf — ^ — ^{K — l)2/(2--ft^)^ Figure [TU] (full simbols) compares the theoretical prediction for various values of K with 
the average density observed in the simulation of the Gazmuri's algorithm on random regular graphs of size N = 1000. 
In a small network, a single realization of the Gazmuri's algorithm can deviate considerably from the average behavior, 
even if the degree distribution is initially regular. The fluctuations are now associated with the number of possible 
untouched nodes exposed by the pairing process in a time step. This can be easily quantified applying the path- 
integral method to obtain the large deviation functional. The probability that in the T + I*'' temporal step of the 
algorithm the number of untouched nodes changes of Ay is 



K 



P^+i(AF) = ^<5Ay,-i-„ 
In Fourier space it becomes 



r!=0 



/ Y{T) 



n \N-2T 



1 - 



Y{T) 
N -2T 



K-n 



(34) 



+00 K 



AF=-oo n=0 



K\ ( Y{T) 



n J \N -2T 

K 



AT OT^ \ ' 



N -2T 



Y{T) 
N -2T 



K- 



-ie,{T)AY 



(35) 
(36) 



19 




FIG. 12: (Color online) Entropy curves s{p) obtained with BP on both ER (left) and random regular graphs (right). On the 
curves we report the minimum and maximum densities at which we can find mIS using different algorithms: the Gazmuri's 
algorithm (black plus symbols), standard Monte Carlo (red crosses), rearrangement Monte Carlo (blue circles), Monte-Carlo 
with chemical potential fixing the average p (red squares), fixed density Monte Carlo (green triangles), BP decimation (black 
diamonds). Data are obtained for graphs with A*' — 10^ nodes and average degree z,K — 4. 



The corresponding Lagrangian in the continuum limit is 

^(y, y, = -^m ~ ^mm - k log 

Imposing the stationarity, we find the saddle-point equations 



1 - 2t 



y 



K{e'i - 1) 



1 - 2< + y(e»4 - 1) 



-1 



l-2t + y{e'i - 1) ' 



(37) 

(38) 
(39) 



Solving the equations and computing the large deviation functional X{p, K) we find the behavior displayed in Fig. 
Illf B (dashed line), in which we also plotted the results of the numerical simulation of the Gazmuri's algorithm on 
random regular graphs of small sizes. As for ER graphs, the statistics of rare events is extremely well reproduced by 
our theoretical calculations. 

The dynamics of Gazmuri's algorithm is relevant for economic applications, because it reproduces the main features 
of the best-response dynamics in Best-Shot strategic games Q. In the best-response (BR) dynamics, all variables are 
initially assigned to be or 1 with a given probability pi„. At each time step a node i is randomly selected: if at 
least one of the neighbors of i is 1, then the node i is put to 0, otherwise if all neighbors are 0, the node is put to 1. 
This type of dynamics has been recently studied by Lopez-Pintado [l9j on uncorrelated random graphs by means of 
a dynamical mean-field approach. In random regular graphs of degree K, the density of covered nodes (contributors) 
evolves following the equations 



dp{t) 
dt 



= -pm - (1 - p(t))^) + (1 - pum Pit)) 



K 



(40) 



and converges rapidly to the fixed point p = (1 — p)^^. This solution looks different from that obtained by means of 
the pairing process; however, the actual process is not very different. In fact, for the nature of the strategic games 
associated with the mIS problem, a single sweep of BR over the system is sufficient to reach a Nash Equilibrium 
(i.e. to satisfy all variables, without generating contradictions). Therefore, Best- Response dynamics behaves like the 
Gazmuri's algorithm, apart from choice of the initial conditions that could introduce a bias in the density values. 
Simulating BR dynamics with different initial bias we verified that the final density is almost independent of pin 
and agrees reasonably well with the results of Gazmuri's algorithm (not shown). 
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FIG. 13: (Color online) Entropy curves s{p) obtained with BP on both ER (left) and random regular graphs (right). On the 
curves we show the results of the thermodynamic integration by Monte Carlo methods (squares and circles) averaged over at 
least 50 realizations of graphs with A'' = 10'^ nodes. 



B. Monte Carlo methods applied to mIS problem 



1. Different types of simulated annealing 



Monte Carlo algorithms for finding maximal independent sets are based on a simulated annealing scheme for the 
auxiliary binary spin model in which the energy E of the system corresponds to the number of unsatisfied local 
constraints (that we have already defined in Section ITvl) i56j . Starting from the high temperature region (i.e. random 
configurations of Os and Is), we slowly decrease the temperature to zero, with the following Metropolis rule: 1) pick 
up a node randomly and flip its binary variable; 2) if the energy is decreasing, then accept the move with probability 
1, otherwise accept the move with a probability e~^^^. 

In the absence of a constraint on the density of covered nodes, the algorithm always finds a solution of typical 
density. Fig [TU] (diamond-like symbols) shows the dependence of the average density of covered nodes p for the mISs 
obtained with this thermal Monte Carlo as a function of the degree K in both random regular graphs and ER random 
graphs. It is interesting to see that maximal-independent sets found with standard simulated annealing have different 
statistical properties compared to the solutions of greedy algorithms. Checking on the entropy curves obtained with 
BP equations, we see that MC simulations find the thermodynamically relevant solutions that, in the absence of 
chemical potential (/i = 0) are those of typical density that corresponds to the entropy maximum. At a difference 
with Monte Carlo, the Gazmuri's algorithm does not find solutions of typical density but systematically overestimates 
it, finding solutions of slightly larger density of coverings. This phenomenon could be due to the non-equilibrium 
nature of the process and deserves further investigation. 

In order to find solutions in the region of non-typical densities, we consider two main strategies: i) a MC algorithm 
working at fixed number of covered nodes (fMC); ii) a MC algorithm fixing the density by means of a chemical 
potential (gMC). 

In the first case, it is possible to use the following non-local Kawasaki-like move: 1) pick up two nodes at random, 
if they are not both Os or Is, exchange them and compute the variation of energy (number of violated constraints). 
2) Accept the move with usual Metropolis criterion depending on the variation of the energy AE and on the inverse 
temperature /3. Cooling the system from high temperature to zero allows to find mIS at a given density of Is. In 
our simulations performed on graphs of = 10'^ nodes, we are able to find mISs at all densities between a lower 
limit p''fMC upper one pYmc (^^^ Table IIIII) . In Fig. [T^l these values are reported (open triangles) on the RS 

entropy curve for ER (left) and random regular graphs (right). Note that they are quite far from the minimum and 
maximum predicted by cavity methods. 

An alternative approach consists in using a grand-canonical lattice gas formulation, or in terms of spins by the 
addition of an external chemical potential coupled with the density p, i.e. changing the energy E ^ E + pj^i'^i- 
The global optimization of the energy now mixes the attempt to minimize the number of violated constraints with 
that of minimizing (or maximizing depending on the sign of p) the number of covered nodes and requires a careful 
fine tuning of parameters in order to get zero violated constraints at the expected density of covered nodes. A better 
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choice is that of modifying the energy a.s E —i' E + fj,\ J^i ""i ~ ^P* 1 7 with p* being the desired density of covered nodes. 
Apart from the details of implementation, the Monte Carlo dynamics follows the usual thermal criterion: 1) pick up 
a node randomly and flip its binary variable; 2) if the energy is decreasing, then accept the move with probability 1, 
otherwise accept the move with a probability e'^^^ . By fixing /i > we just tune the speed of the convergence of 
the density of Is to the desired value p = p* during the cooling process (increasing values of /3). 

The fact that the number of Is is fixed only on average does not seem to help the system to accomodate the 
configurations more easily than in the fMC case. The results are comparable and, in the low density region, fMC 
seems to perform better (see Fig. [T^ and Table IIIip . 

2. Entropy by Thermodynamic Integration 

Monte Carlo algorithms can be used also to give an estimate of the entropy of solutions by means of the well-known 
thermodynamic integration method [57| . This method, that is commonly used to compute the number of metastable 
states or blocked configurations in granular systems |58| , can be applied to the present problem in a very natural way. 

For a system in the canonical ensemble, we can express the specific heat C as a function of the internal energy E, 
by C(/?) = — and as a function of the entropy S, by C(/3) — — Energy and entropy are therefore related 

by dS(/3) = /3^d/3. Since the energy can be computed numerically using a Monte Carlo algorithm, it is convenient 
to integrate by parts and consider 

5(/3) - 5(/3 = 0) = /3e(/3) - / e{/3')d(3', (41) 

Jo 

where we have used the rescaled quantities s = S/N ans e = E/N. Equation |4T] provides the zero-temperature 
entropy s(oo) once we know e(/3) and the infinite temperature entropy s(0). In our case, the calculation gives an 
estimate of e(/?) and so the entropy of maximal-independent sets on a given graph. Indeed using Monte Carlo we can 
not find very accurate values for the average energy expecially if there exist a phase transition. Even when there is 
no phase transition taking place in the integration range, there are other sources of inaccuracy. More precisely, we 
can investigate a large but finite interval [0,/?maa;], with f3max ^ 00, thus if the Monte Carlo algorithm is not able 
to reach the ground-states at some /? G [0,l3max], the numerical integration can only provide a upper bound for the 
real entropy. In some situations, the two errors may sum up, because replica symmetry breaking also causes slowing 
down in Monte Carlo algorithms. This is the main reason why we cannot push this method up to the extreme values 
of density predicted by theoretical calculations. 

We have used the grand-canonical MC algorithm (gMC) described before to compute the entropy of solutions 
with non typical densities of covered nodes. Note that the integration formula Eq. [41] is correct also for the gMC 
algorithm because it corresponds to a canonical spin system with external field h oc /i//3, that only modifies the 
energetic contribution. At infinite temperature, /3 = 0, the entropy is log(2) because all states are accessible, but for 
large values of the chemical potential p, the system rapidly concentrates around the desired density p as soon as we 
increase /3. The final entropy s{Prnax) gives an estimate of the entropy of mIS with a density p of covered nodes. 
In Fig. [T3]we show the results of thermodynamic integration for both ER and random regular graph with z,K — A 
and compare them with the curves obtained with the cavity approach. We have averaged the entropy values over 
50 realizations of the graphs and the standard deviation of the values are smaller than data symbols. The points 
perfectly agree with the results of the cavity approach showing that, in the range of validity of BP equations they 
correctly predict the statistical properties of maximal-independent sets. 

3. Walking on the space of solutions by "rearrangement Monte Carlo" 

In Section Hill we have seen that it is possible to go from a mIS to another one repeating a simple operation, that 
consists in flipping a variable from to 1 (or viceversa) and rearranging the values of all neighbors iteratively until 
a new mIS is found (see Fig. [2]). It was proved that the operation always involves a flnite number of variables, 
that makes it possible to implement this process inside a Monte Carlo algorithm. Moreover, Proposition [3] ensures 
that, given two mIS configurations, it is always possible to go from one to the other and back with a sequence of 
operations of this kind. The sequence of operations is finite whenever is finite. A Monte Carlo algorithm based on 
this operation is thus expected to be ergodic in the space of all maximal independent sets (see App. [Bland Ref. 0]). 

We define the following MC algorithm: 

0. The initial state is chosen finding a typical mIS by best-response starting from a random configuration. 
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Q. 




FIG. 14: (Color online) Example of the slowing down phenomenon taking place at large chemical potential in the "rearrangement 
Monte Carlo" method explained in Section FV B 31 The curves represent the density of covered nodes in the mlS sampled by the 
rMC algorithm as chemical potential fi slowly increases (lower curve) or decreases (upper curve) in an ER graph of A*' = 10"^ 
nodes and average degree z = 4. 

1. We select a node randomly, we try to flip it and readjust all neighboring nodes propagating the rearrangement 
until all nodes are satisfied. In this way we generate another mIS. 

2. We compute the variation Ap of the density of covered nodes between the two configurations, and accept the 
move with probability 1 if the density decreases and probability e~^^'' if it increases. 

3. We repeat points 1.-3. for a given number of iterations, then we stop or change the chemical potential /i. 

By performing a simulated annealing in which /i is slowly increased (decreased) from 0, we find a chain of maximal 
independent sets with decreasing (increasing) density of covered nodes. 

Having finite rearrangements means that the variations in the number of covered nodes are finite as well and the 
density p is almost constant. Therefore appreciable density fluctuations only occur after 0{N) rearrangements, that 
is a Monte Carlo step. When p. is varied at a sufficiently slow rate, the algorithm should be able to find mISs at all 
densities at which they exist. Fig. [Ml shows some data points taken every 100 MC steps. The density of the mISs 
sampled by the algorithm is reported as function of p. The curves seem to converge to values of the density that are still 
far from the two theoretical bounds (obtained by the cavity method) of the lower and upper SAT/UNSAT transitions. 
At low and large density values, the algorithm is not able to find mIS beyond some threshold p^MC Pbp'^^ > PmAn^ 
and p"a/c^ < p^p"^^ < PmaT- I^ig- HH reports these two values (blue circles) in ER and random regular graphs for 
average degree 4. A direct comparison with other computational bounds shows that this algorithm outperforms the 
other Monte Carlo methods in finding mISs at low and high density of covered nodes. In particular it is much faster 
than the other MC algorithms and provides a large number of mIS at different densities in a reasonably short time. 
The obtained bounds are quite worse than those obtained by BP decimation in the low density phase, but they are 
better in the high-density phase (see Table Ull] and Fig. [T^]) . 

The exact value of the density at which the algorithm stops is only estimated by several numerical experiments 
and further investigation is required in order to understand if some heuristic optimization could allow to reach better 
results, even in the presence of replica symmetry breaking. In fact, at very large chemical potential the algorithm 
becomes sensitive to very small barriers, due to the flip of a finite number of variables. Such barriers do not require 
a change in density, but can trap the algorithm in local minima if the algorithm is running at very large chemical 
potential. If this is true, some heuristic method could be designed in order to improve the performances of the rMC 
algorithm. 

C. BP decimation 

Given an instance of random graph we can run BP equations in Eq. 2] starting from random initial values for 
messages r*^^ . If we reach a fixed point of the equations then the local marginals 
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K 


BP 
Pmin 


BPD 
Pmin 


rMC 
Pmin 


gMC 
Pmin 


"rmn 


fMC 
Hmax 


sMC 
Hmax 


rMC 
Pmax 


BPD 
Pmax 


BP 
Pmax 


3 


0.264 


0.267 


0.269 


0.275 


0.271 


0.449 


0.449 


0.449 


0.449 


0.458 


4 


0.223 


0.228 


0.230 


0.235 


0.232 


0.408 


0.408 


0.410 


0.408 


0.419 


5 


0.196 


0.201 


0.203 


0.206 


0.206 


0.377 


0.377 


0.379 


0.375 


0.387 


6 


0.175 


0.181 


0.185 


0.188 


0.184 


0.349 


0.349 


0.349 


0.348 


0.360 


7 


0.159 


0.165 


0.168 


0.172 


0.169 


0.327 


0.327 


0.328 


0.325 


0.338 


8 


0.146 


0.153 


0.156 


0.158 


0.158 


0.308 


0.310 


0.308 


0.306 


0.319 


9 


0.136 


0.143 


0.145 


0.148 


0.146 


0.294 


0.293 


0.294 


0.289 


0.301 


10 


0.127 


0.134 


0.138 


0.140 


0.138 


0.278 


0.278 


0.278 


0.274 


0.287 



TABLE III: Summary of the minimum and maximum values of density at which we find mISs on random regular graphs with 
different algorithms: BP decimation Pmax' ', fixed-density Monte Carlo pl^'^ , pmax', grand-canonical Monte Carlo p^'^ , 

Pmin' rearrangement Monte Carlo pmin , Pmin ■ BP results are obtained on graphs of size A'' = 10'' whereas Monte Carlo 
results on graphs of size iV = 10^ In the cases of fMC and gMC, we have chosen the values at which at least half of the runs 
were successful in finding a mIS. 



(42) 



will give us the approximate probability of = 1 among the set of mIS's. One strategy of finding a mIS is to 
decimate the most biased variables according to their preference. Suppose that in the first run of algorithm variable 
i has the maximum bias |1 — 2hi\ among the variables. Then if, for example, &i > 1/2 vife fix = 1 and reduce the 
problem to a simpler one with smaller number of variables. The strategy in BP decimation algorithm [S] is to iterate 
the above procedure till we find a configuration of variables that satisfies all the constraints. Certainly if the believes 
bi that we obtain are exact the algorithm would end up with a mIS, if there exist any. Otherwise at some point we 
would find contradictions signaling the wrong decimation of variables in previous steps. 

The results of BP decimation algorithm have been summarized in Table IIIIl For the case if , z = 4, the minimum 
and maximum density at which we are able to find a mIS for graphs of size N = 10^ are also reported as black full 
circles in Figure [T^ 

Notice that similarly one can use a SP decimation algorithm based on SP equations, to find a solution (here a mIS). 
This is usually more useful than BP decimation in problems that exhibit a well clustered solution space. In the case 
of maximal independent sets we did not observe a significant difference in the performance of the two algorithms. 
This is the reason why here we focus on the BP decimation algorithm which is more accessible. 



VI. CONCLUSIONS AND OUTLOOK 



In this paper we have investigated the statistical properties of maximal independent sets, a graph theoretic quan- 
tity that plays a central role both in combinatorial optimization and in game theory. Among the most prominent 
applications it is worth mentioning the development of distributed algorithms for radio networks [sj and the study of 
public goods allocation in economics 0- 

A long-standing problem in combinatorial optimization is to estimate the number of maximal independent sets in 
a given graph, and devise efficient algorithms to find them, independently of their size. In the first part of the paper 
we have focused on some theoretical methods to compute the number of mISs of size M in random graphs of size 
TV. As in general this number is exponentially large Nmis ~ gNs(M/N) ^ have used statistical mechanics methods 
to compute the entropy s{p) of maximal independent sets as a function of the density p = M/N of coverings and of 
the average degree of the graphs. At typical density values, the RS approximation (BP equations) describes correctly 
the system. While the BP equations remain stable in the low density region, for high density of coverings the BP 
equations become unstable and the RS solution does not hold anymore. 

The general IRSB calculations for random regular graphs show a dynamical transition in a small interval of density 
very close to the minimum density. However, the population dynamics algorithm hardly converges to this solution. In 
the high density region we observe a condensation transition to IRSB phase. Here the solution has better convergence 
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than the IRSB solution for low densities. Previous studies, for instance in 3-SAT problem [i^, show that these 
solutions suffer from another kind of instability. A more detailed study of stability of IRSB solutions in this problem 
remains to check for future works. 

From the computational point of view, the main issue is to find maximal independent sets at very low or very 
high density, within the bounds indicated by RS and first-moment calculations. Greedy algorithms, like Gazmuri's 
one, can find mISs at very typical values of the density, whereas Monte Carlo methods and BP decimation can be 
used to explore regions of non-typical values. Our numerical calculations indicate that BP decimation gives the best 
performances, but the values at which we find solutions are still far from the theoretically predicted bounds. Notice 
that despite the dynamical transition in the low density region, the absence of globally frozen variables could make 
the problem easy on average in that region [i^ [5l| . 

We expect that the results we obtained here could be exploited to design more efficient algorithms. A first example 
is represented by the rearrangement Monte Carlo algorithm, that allows to move among the space of mISs sampling 
them with a density-dependent Gibbs measure. Apart from dramatic slowing down faced by MC algorithms in 
presence of RSB, the algorithm should be able in principle to reach all existing mISs. 

A maximal independent set on a graph Q can be viewed as a saturated packing of hard spheres of diameter d — 2. 
So the minimum and maximum mIS densities define the region one can have saturated packings and in this paper we 
gave these limits in the RS approximation. It would be interesting to see how computational and physical properties 
of mISs change by increasing diameter d. 
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Appendix A: Annealed calculations and bounds in random graphs 

We compute here some rigorous mathematical results on the number of maximal independent sets with a given 
density p of covered nodes. The first moment method allows to give lower bounds and upper bounds p^^^^ for 

the density of covered nodes in a mIS on random graph. 

Let Xm denote the number of mISs of size M in a graph Q of size N, from the Markov inequality we have 

Proh(XM > 0) < Xm- (A1) 

where Xm is the average number of mISs in the ensemble of graphs of size N to which Q belongs. If for some values 
oi M < N the average number of maximal independent sets of size M becomes zero, then Markov inequality implies 
that the probability to find a mIS of size M also vanishes. 

In the Erdos-Renyi ensemble of random graphs G{N,p), the average number of maximal independent sets of size 
M is given by ^ 

Xm = -I')^!! - H-Prr-^'. (A2) 

For large N and AI with fixed p = M /N and p — z/N, the asymptotic behavior of the number of mIS is Xm ~ e^^^^''^ 
where 

Slip) = -pHp) - (1 - p) ln(l - P) - Izp' + {1-P) ln(l - e"^''). (A3) 

The density values at which the entropy si becomes negative give bounds for the existence of maximal independent 
sets. Therefore pl^Yn^ is a lower bound for the density of occupied nodes in the minimum mIS (mis) and p^^^^ is an 
upper bound for the density of occupied nodes in the maximum mIS (MIS). The values of /Omirf ^^'^ PmST ER 
random graphs with several values of the average degree z are reported in the first and last column of Table ID 

The first moment calculation can be extended to random regular graphs (RRG), i.e. graphs in which connections 
are established in a completely random way with the only constraint that all nodes have the same finite degree K (we 
will consider diluted networks, i.e. K ^ N) 
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It gives 



f^lll-P^l^ll-d-rt"]'-"" (A4) 



expN 



^pHp) - (1 - p) ln(l -p)- ^p" + {l~p) log [1 - (1 - p)^] 



(A5) 



As before, extracting the zeros of the entropy function for various degrees K, we obtain the values of p''^^^ and Pmai'^ 
that are reported in Table |lTl As we will verify later comparing these results with those from the cavity method, only 
the lower bound is tight, whereas the upper one strongly overestimates the existence of mISs. However, the entropy 
values at typical density of coverings (e.g. the maximum of the entropy) are in agreement with other theoretical and 
numerical results, corroborating the validity of this improved annealed calculation. 

The annealed approximation we have just discussed only gives an upper bound for the real number of mISs, therefore 
it would be important to have also a lower bound for the real entropy curve s{p). This is usually obtained by the 
second moment method. Let X\j be the second moment of the number of mISs of size M in a graph of size N , the 
Chebyshev's inequality provides a lower bound for the probability of finding a mIS of size M, 

PTob{XM > 0) > (A6) 
For ER random graphs, the second moment is 

where I is the overlap between the two configurations corresponding to mISs of size M . In the scaling limit, p = M/N 

and X = l/N 

Xj^^N r dxe^''^P'''\ (A8) 
Jq 

Let us call x* the value of the overlap maximizing S2{p,x). The density values where 2si{p) — S2{p,x*) vanishes 
should give the lower bound p\'^^^ for the density of the MIS and the upper bound p^lri^ foi' the density of the mis. 
Unfortunately, for ER random graphs, S2{p,x*) is always larger than 2si{p), meaning that the ratio x\,i/Xf,j always 
vanishes in the thermodynamic limit. The corresponding trivial result Prob{XM > 0) > does not say anything 
about the extremal densities for MIS and mis. The same holds for the ensemble of random regular graphs. 



Appendix B: Proof of the results of Section IIIII 

Here are the proofs of the result shown in Section IIIII from which we maintain the notation. Consider a finite 
network and call ai e {0, 1} the membership of node i to a set T. It is clear that there is a one-to-one correspondence 
between any subset of the nodes and any vector g_. We will consider those g_ for which I is a mIS in a given graph Q. 
Call finally Nl the set of nodes which are first neighbors of node i, and Nf those which are second neighbors of node 
i. By definition of mIS, we have that X is a mIS if and only if g_ is such that 

<7i — 1 if fjj = for any neighbor j of node i; (Bl) 
(Ti = otherwise. 

Equation IB 1 1 defines what is the best response dynamics. For any node i there is always one strict best response given 
any memberships' configuration of its neighbors in Nl . This would hold 'a fortiori' also in a mIS configuration, and 
hence proves that u_ is locally frozen (Proposition [Ij . 

Proposition [5] tells us that the best response rule will imply a new mIS, and that any best response dynamics of 
the other nodes will be limited to the second degree neighborhood of the node which initially flipped. 

Proof of Proposition [2l suppose node i is in the mIS, so that tXi = 1, and we remove it so that cr"™ = 0. 
Consider now any node j in N} , it is clear that ctj = since 0-^ = 1. By best response, for all those j G N} such that 



26 



CT/c = for any k € Nj\{i}, we will have a'j^'^ = 1. In the case that two such j's that flipped from to 1 will be linked 
together, by best response only some of them will flip to 1 (this is the only random part in the best response rule). 
If j is such that <Tj — and cr""^"' = 1, it is surely the case that any k G Nj\{i} was playing ak = and remains at 
cr^'^™ = 0. Finally, if no neighbors j £ Ni flip from to 1, we will allow node i to turn back to its original position. 
The propagation of the best response is then limited to N} U {i} (and ends in an mIS, possibly the old one). 

Note: a best response from to 1 applies only to nodes that are 0, are linked to a node which is flipping from 1 to 
0, and that node is the only neighbor they have who is originally 1. 

Suppose now that cr^ = and we flip it so that cr"^™ — 1. The nodes j in Nl who had aj = will continue to do 
so. Any node j in N} (at least one) who had <jj = 1 will move to cr'^™ = 0. By the previous point this will create a 
propagation to some k G Nj, but not i. This proves that the propagation of the best response is limited to Nl U Nf 
(and ends in a new mIS). □ 

Finally, we prove that any mIS can be reached in finite steps, by best response dynamics, from any other mIS 
(Proposition [3l). 

Proof of Proposition [Sj we proceed by defining intermediate mIS a}, o;^, . . . between any two mIS a and a' 
(associated to I and I'). will be obtained from ct" by flipping one node from to 1 and waiting for the best 

response of all the others. 

If two mIS a and a' are different, it must be that there is at least one ii such that (7^^ = and tr^^ — 1 (by definitions 
any strict subset of a maximal independent set is not a covering any more). Change the membership of that node so 
that a}^ — = 1. By previous proof this will propagate deterministically to N}^ and, for all j G N}^, we will have 
(jj = (7^ = 0. Propagation may also affect Nf_^ but this is of no importance for our purposes. 

If still a} 7^ o;', then take another node ^2 such that a}^ = and ct,-^ = 1 (12 is clearly not a member of Nl_^ U {ii}). 
Pose af^ = o--^ = 1' this wiU change some other nodes by best response, but not j G Nl_^ U {ii}, because any j G Nl_^ 
can rely on a}^ — 1, and then also af^ — a}^ = 1 is fixed. 

We can go on as long as ct" ^ ct', taking any node in+i for which crj^^^ = and cr[^^^_^ — 1. This process will 
converge to a;" — > a;' in a finite number of steps because: 

• when in+i shifts from to I, the nodes j G lJh=i (-^ih i^hY) will not change, since they are either 0-nodes 
with a 1-node beside already (the l~node is some ifn with /i < n), or a 1 (some ih) surrounded by frozen O's; 

• by construction it is never the case that i„+i G U/^=i {^h\), because for all j G U/I=i ^ {^h}) we 
have that cr" ~ a'^ ; 

• the network is finite. □ 



The shift from a_ to ct' is done by construction re-defining the covering of any ct" from the covering of ct'. It is 
always certain that, by best response, any ct" is also an independent set. 



Appendix C: Survey Propagation 

Survey Propagation (SP) j43l | allows to study the IRSB phase (if any) in a simplifying limit where we have only 
one parameter y = m/i. The corresponding equations are called SP equations and provide the behavior of the m — Q 
complexity S(/9). 

If the solutions are clustered, the believes {r^^^'' , t'q^"' , r\^'' } are not distributed in the same way from cluster to 
cluster (different Gibbs states), therefore we have to introduce a distribution of cavity fields. In writing SP equations 
we assume that these distributions are 'P[r'^-'] ~ VaSir^^^ — 1) HbTta '^(''b^"') ^oi' a,b = 0, 00, 1. Cavity equations 
thus translate into equations for the surveys {r/o,r]oo,r/i}. 

In the limit — > — c» they read 

f-(f-ryi)^-i 

'70 - 



l + {e-y -l){l-fii)K-i' 
Voo = 0, 
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FIG. 15; (Color online) Comparing IRSB complexity for m — (dashed line) with the BP entropy (solid line), 
where is the penalty favoring clusters (solutions) with higher density. The density of occupied nodes is 



In the Bethe approximation 



e-y{i-m) 



K 



l + (e-!'-l)(l-ryi)^- 



K 



(C2) 



(C3) 



with 



2?7i?7o, 



(C4) 



Solving numerically the equations we find that the complexity is always a bit larger than the BP entropy, that is an 
unphysical result. Figure [T5] compares the complexity with the RS entropy for if = 3. The same behavior is observed 
for larger degrees. Indeed, looking at the stability of this first set of SP equations (details are reported in Appendix 
ID|) , it turns out that the SP equations ICll are not stable in the whole large density region. 

In the other limit /_* — +oo, wc find the equations 



T]i oc [(1 - rji) 
rja oc (1 - r/oo) 



K-l 



K-l 



4-' 



while the generalized thermodynamic potential is given by the same expression but with 



e-ni-7?i)^ + (l 
2?7i(»7o + »7oo), 



(C5) 
(C6) 
(C7) 

(C8) 



with density p obtained as 



K 



e-2'(l"r;i)^ + (l-77oo) 



K 



(C9) 



In Appendix |D] we show that the SP solution is stable in the whole low-density region. However as in the high 
density region we find a complexity that is a bit larger than the RS entropy, see figure 1151 

As expected, m = is not the correct approximation to describe our system in the IRSB phase. 
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FIG. 16: (Color online) Checking stability of BP equations: typical behaviour of A (solid line). Vertical dashed lines represent 
the minimum and maximum densities predicted by BP equations. 



Appendix D: Stability Analysis 
1. RS stability 

The BP equations for the nilS problem involve three cavity fields r*^-' with a = 1,0, 00, therefore in order to check 
the stability of the RS solution we have to compute the response induced in these fields by a small perturbation in 

the neighboring cavity fields The elements of the stability matrix are obtained as Ma t — t^t^i ; 



e-/^(l - r-i)^"2 



K-2 



72 



Moa = [(1-roo) 

Mo,o = --^ 
Mo,oo 

^'^"00,00 = (1 - ^'oo) 



K-l 



„K-r 



e-^(l-ri) 



K-2 



(l--oo)^-Vn . ^A--2(l--oo)^-^ 
+ (1 - roo) 



„K-1 



'0 




Zm 

ie-''(l-ri)^-2 



72 



72 



K-2 ' 



72 



\K-1 



(Dl) 



(D2) 



(D3) 



with Zm = e-^{l - ri)^-i + (1 - r^^Y 

If Am is the largest eigenvalue of M, computed at the fixed point of the BP equations, then the (spin-glass) stability 
condition reads A = {K — 1)\\[ < 1. If A is larger than 1, then the perturbation is amplified by iteration and the 
RS solutions are unstable fixed points of the BP equations. Figure [T7] shows how A changes with density in random 
regular graphs of degree K — i. The same behavior is observed for other degrees. 
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2. SP stability 

Suppose that, according to IRSB picture, solutions are organized in a large number of clusters that represent 

different Gibbs pure states. There are two kinds of possible instabilities: a) states can aggregate into different 
clusters, or b) each state can fragment in different states. We study here if this can occur for the mIS problem in the 
two limits defining the SP approximation. 



a. Limit /u — ^ +00, m — ^ (y = mfj,) 



The first kind of instability is related to the divergence of the inter-cluster spin-glass susceptibility, i.e. the instability 
of SP fixed points on single graphs. Hence the calculation is equivalent to the RS case. We need the 3x3 matrix 

Mab = Pf^, where a = 1, 0, 00. We have 

M,, = - --''^-"■'"-^ + "'«^-"|:--''"' [.-'(l-.0--l, (D4) 

Mi,o = + ^2 [e \o J, 

^1,00 - ^ [(l-%oJ J, 

Mo,i = ^^-'^''^l~'-'^^~\ e-y{i-mf-% (D5) 

Mofl ^ 1 ^2 % J' 

-MO.OO ^ \ ^2 [(l-%Oj J, 

K-1 K-1 

-Moo,o - -^^ h -^[e "rjo J, 

„K-1 

''0 r/1 ^^-2l 



with 



Moo.oo = -^[(l-??oo) 



= e-y[{l - r;i)^-i - ,70^-1] + (1 - 7700)^-^ (D6) 
If Am is the dominant eigenvalue of M then the first kind stability condition reads 



A, = {K~ l)Xi, < 1. (D7) 

The second kind instability is instead related to intra-cluster susceptibility, and can be studied by means of "bug 
proliferation" defined on clusters. This means that we consider how a change in a single cavity field from, say, a to 
b is propagated to neighbors and how this reflects on the distribution of surveys inside the cluster. If the instability 
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"bug" is Aq-j.;,, we need the transfer matrix Tab,cd satisfying to \a^b = '^cd'^o-b,cd^c^d- The iterative equations are 



??iAi^o = ^[(X-l)?7^-^ooAoo^i], (D8) 



f?iAi^oo = 17— [(^-1)%' ^'7ooAoo^o], 



?7oAo^oo = -^[{K-l)r]Q ^??iAi^o], 

e-y 

??OoAoo^O ~l)Vo^^>^Q^l]■ 
If At is the dominant eigenvalue of T then the second kind stability condition is A2 = At < 1- In the present case, 
Y = ^ {K — 1)7]^ and the stability of SP solution is satisfied in the whole low-density region, see Figure [T71 

b. Ltmit jj. — >■ —00, m — )■ (y = mjj.) 
In the high-density limit, equations simplify because ryoo = always and the matrix M reduces to the element 



e-y{i ~ e-y{e-y - i)(i - 771)2^-3 

with 



dvi = - ' + -0-^ • (D9) 



ZM = l + {e-y -m-mr-'. (DIG) 



So for the first kind instability, the stable region satisfies Ai = {K — l){drii)^ < 1, 
Similarly, equations for bug proliferation reduce to 



r/iAi^o = ^(l-7?i)^'-i[(i^-l)Ao^i], (DU) 



-y/2 



SO to have second kind stability we need A2 = (1 — fji)^ ^ < 1. Numerical solutions show that the SP 

solution is unstable in the whole large-density region, see Figure [T71 

Appendix E: Population dynamics 

In the paper we had to use population dynamics two times; in the RS study of ER random graphs Eq. IIV A 21 
and in IRSB study of random regular graphs Eq. IIVBI Population dynamics is a way of solving these equations by 
representing a probability distribution with a large population of variables. Let us describe how we solve the equation 
in the IRSB case which is similar but more general than Eq. IIV A 21 This is the equation 




We define a population of size Np with elements r", a = 1, . . . , Np. Each rf is a probability vector and frequency of 
a vector r in the population gives an estimate of 'P(r). 

We start by a random initial population and in each step we update the population in the following way: 

• select randomly members ai, . . . ,aK-i from the population, here K is node degree. 
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FIG. 17: (Color online) Checking stability of SP equations: typical behaviour of Ai (upper picture) and A2 (lower picture) in 
the low density (solid line) and large density (dashed line) region. Vertical dashed lines represent the minimum and maximum 
densities predicted by BP equations. 



• use . . . ,r°^-i to find r"-™ and A f cavity according to BP equations, 

• with probability cx e^v^f^avity replace a random member of population with r"*^™, 

After sufficiently large number iterations tfie population reaches a stationary state that can be used to obtain the 
interesting quantities. For instance the generalized free energy $ is given by 



K 



where the averages are taken over the population 



(E2) 



(E3) 



Derivatives of $ define the other average values. 
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FIG. 18: (Color online) Comparing entropy of the packing (left curve), the covering (right curve) and mIS problems (middle 
curve) on random regular graphs of degree K — 3. 

Appendix F: Two sub-problems of the mIS problem 

In this Appendix we analyze the effects of the two constraints acting on the nodes of the graph separately. 

1. Packing constraint 

In this case we have only packing constraints Iij{ai,aj) which are satisfied if = V cTj — 0. Then BP equations 
are 



<^keai\j kedi\j 

In random regular graphs (degree K) we take z/i_^j(l) = r and the BP equations simplify into 



(Fl) 



e-^(l -r) 



K-l 



Then the free energy reads fif ~ fJ-Afi — ^iiAfij — fip ~ s, 



l + e-''(l-r)^-i 
J ^ fip- s, 



(F2) 



(F3) 



and p - 



2. Hyper-covering constraint 

The covering constraints Ii{(Ti, agi) are satisfied if ai + J^j^Oi '^i ^ ^- ^^'^ equations read 



Vi^j{(Ti,(Ji^j) cc'^ /fc((Tfe,crafe)l/fe^i(crfe,crfe^i)e" 



-ti<Ji 



(F4) 
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In fixed degree graph we can define the cavity fields as for the full problem {r\~^'' , Tq"*^"' , ^'q^-' } and we get 

^1 = -T-^f^^-TTfTT. (F5) 



ro = 





+ (1- 




(1- 




-1 _„K-1 
'0 




+ (1- 






'o 


-1 


g-p 


+ (1- 





The free energy contributions are 



=e-^ + (l-roo)^-ro^, (F6) 
-mA/.. =rl+rl + 2ri{ro + roo), 



and p = _„,,., ^ — — 3r . 

The resulting curves for the BP entropy are displayed in the representative case of = 3 in Fig. 1181 The curve 
for the packing problem is close to that of maximal-independent sets for large density, whereas the hyper- covering (or 
conjugate lattice glass) curve does the same for low densities. Together, the two curves define an envelope that gives 
an upper bound for the BP entropy of the mIS's problem. 



Appendix G: Higher-order independent sets 



Higher order maximal-independent sets are important for applications in computer science and economics. In 
particular, maximal- independent sets of order 2 are the specialized Nash equilibria of the continuous model of public 
goods game proposed by Bramoulle and Kranton in Ref . [6] . 

In our CSP, a mIS of order n is obtained just by imposing the condition that each empty node has at least n 
occupied neighbors. For n = 1 we recover the original mIS definition. In general we write BP equations for the cavity 
fields r\~^^ = Rl~^^, r>7/-2 = Em>„-2 K~Z^ ^^'^ ''>^-i = Em>«-i ^o^^- On random regular graphs, they read 

n cx e-'V^-i^, (Gl) 

K-l 

r>„_2 cx 

l=n-l 
K-l 

r>„_i cx ^ 

l—n 

Then the free energy reads 



K 



K 

E 



'>n-2 • A^yi 
l—n 



K 



l' >n-l' 



(G2) 



and 



P = e-'V«„„2e''"^'- (G3) 

In random regular graphs a maximal independent set of order n = 2 appears for the first time at ii' = 5. For larger 
degree values, the BP entropy is shown in Fig. 1191 It is worthy noting that the maximum entropy value increases for 
larger degrees, in contrast to what happens in the n = I case. 
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